A new code transformation technique for nested loops
Similar resources
A new code transformation technique for nested loops
For good performance of every computer program, good cache utilization is crucial. In numerical linear algebra libraries, good cache utilization is achieved by explicit loop restructuring (mainly loop blocking), but it requires a complicated memory pattern behavior analysis. In this paper, we describe a new source code transformation called dynamic loop reversal that can increase temporal and s...
Compiler transformation of nested loops for general purpose GPUs
Manycore accelerators have the potential to significantly improve performance of scientific applications when offloading computationally intensive program portions to accelerators. Directive-based high-level programming models, such as OpenACC and OpenMP, are used to create applications for accelerators through annotating regions of code meant for offloading. OpenACC is an emerging directive-ba...
Transformation of Divide & Conquer to Nested Parallel Loops
We propose a sequence of equational transformations and specializations which turns a divide-and-conquer skeleton in Haskell into a parallel loop nest in C. Our initial skeleton is often viewed as general divide-and-conquer. The specializations impose a balanced call tree, a fixed degree of the problem division, and elementwise operations. Our goal is to select parallel implementations of divide-...
Software Pipelining for Nested Loops
In this paper, we present a novel framework of software pipelining for nested loops. Under this framework, a periodic scheduling function, called r-periodic schedule, is associated with each operation of the loop body in the entire iteration space. We present a simple problem formulation as well as efficient solutions which gives provable asymptotically time-optimal schedule for nested loops unde...
Imperfectly-Nested Loops Yonghong
This paper presents an integrated compiler framework for tiling a class of nontrivial imperfectly-nested loops such that cache locality is improved. We develop a new memory cost model to analyze data reuse in terms of both the cache and the TLB, based on which we compute the tile size with or without array duplication. We determine whether to duplicate arrays for tiling by comparing the respect...
Journal
Journal title: Computer Science and Information Systems
Year: 2014
ISSN: 1820-0214,2406-1018
DOI: 10.2298/csis131126075s